ABSTRACT

Analysis of variance (ANOVA) is often used in quality control
studies. It assumes equal variabilities within groups, and no exact
procedures have been available for cases with unequal variabilities.
In this paper exact procedures are given and illustrated. An
indication of the losses to be incurred by using the traditional
F-test when variances are unequal is given.


INTRODUCTION 

The statistical analysis of variance technique (ANOVA) is often
used in various experimental designs in quality control studies.
For example, it is used in studies where one wishes to determine the
significance of the variables under experiment, as in the paper and
pulp industries, where bleaching studies investigate the effects of
temperature, type of bleaching chemical, pH level, and consistency on
pulp brightness, and in response surface methodology for evaluating
the fit of models. This analysis assumes that the observations are
normally distributed, and that the variability of results within a
treatment is the same for every treatment.

Dr. Dudewicz is Professor (Department of Statistics) at The Ohio
State University, Columbus, Ohio. Dr. Bishop is a research scientist
for the Battelle Columbus Laboratories, Columbus, Ohio. A previous
version of this paper was presented at the 1977 ATC in Philadelphia
and appeared in the ATC Transactions.

This research was supported by Office of Naval Research
Contract No. N00014-78-C-0543.

While experimenters are often cautioned that "the assumption of
equal variability should be investigated" (e.g., see page 91 of Cochran
and Cox [3] or page 46 of Section 27 of Juran, Gryna, and Bingham [7]),
no exact statistical procedures have been available for dealing with
cases where one finds that variabilities are in fact unequal. While
variance-stabilizing transformations and other approximate methods have
existed for many years, most experimental situations are such that the
problem is far from solved by these approximate methods. For example,
such methods misallocate sample size by taking the same sample size
from a treatment with relatively small variability as from a treatment
with relatively large variability, even though the need for
observations on the latter is substantially greater and they have a
greater beneficial effect on performance characteristics of the overall
analysis. Also, such methods provide only rough estimates and
confidence intervals on the parameters of interest, the parameters of
the original problem before a transformation is applied.

In this paper we give exact procedures which we have recently
developed for ANOVA when treatment variabilities differ. The
procedures are illustrated on typical quality control situations, with
explicit attention being given to the level and power of the test.
Recommendations are given as to when one should abandon the common
ANOVA procedures in favor of these new ones, with an indication of the
costs one may incur by not doing so.

NEW  ANOVA  PROCEDURES 

We will describe the new procedures in the context of the one-way
layout; similar procedures are available [2] for higher-way layouts.
In the one-way layout, $X_{ij}$ is the $j$th observation on the $i$th
treatment ($i = 1, 2, \ldots, k$). It is assumed that the $X_{ij}$'s are
independent and normally distributed with mean $E(X_{ij}) = \mu_i$ and
variance $\mathrm{Var}(X_{ij}) = \sigma_i^2$, where
$-\infty < \mu_i < +\infty$ and $0 < \sigma_i^2$, but $\mu_i$ and
$\sigma_i^2$ are otherwise unknown. The goal (purpose of the
experimentation) is to make inferences about
$\mu_1, \mu_2, \ldots, \mu_k$, which often represent average process
yields. For example, we might want to test the null hypothesis

$$H_0 : \mu_1 = \mu_2 = \cdots = \mu_k \qquad (1)$$

that the treatments do not produce different mean yields. In classical
ANOVA procedures it is also assumed that
$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$, but we do not make this
assumption.

Our procedure for this problem, which we call Procedure P, is as
follows: Choose a number $z > 0$ (this number is related to the power of
the test, and how to choose it will be discussed later), and take an
initial sample of size $n_0$ from each of the $k$ treatments or
processes. Any integer sample size $n_0 \ge 2$ will work, but values
$n_0 \ge 12$ will give the best results. For the $i$th process let
$s_i^2$ denote the usual unbiased estimate of $\sigma_i^2$ based on the
first $n_0$ observations, and define

$$N_i = \max\left\{ n_0 + 1,\ \left[\frac{s_i^2}{z}\right] + 1 \right\} \qquad (2)$$

where $[x]$ denotes the greatest integer less than $x$ (e.g.
$[5.3] = 5$). Then take $N_i - n_0$ additional observations from the
$i$th process so we have a total of $N_i$ observations from the $i$th
process; recall that these observations are denoted by
$X_{i1}, X_{i2}, \ldots, X_{iN_i}$. Now compute

$$\tilde X_{i\cdot} = \sum_{j=1}^{n_0} a_i X_{ij} + \sum_{j=n_0+1}^{N_i} b_i X_{ij} \qquad (3)$$

where

$$b_i = \frac{1}{N_i}\left\{ 1 + \sqrt{\frac{n_0 (N_i z - s_i^2)}{(N_i - n_0)\, s_i^2}} \right\} \qquad (4)$$

$$a_i = \frac{1 - (N_i - n_0)\, b_i}{n_0} \qquad (5)$$
Then compute the test statistic

$$\tilde F = \sum_{i=1}^{k} (\tilde X_{i\cdot} - \tilde X_{\cdot\cdot})^2 / z \qquad (6)$$

where

$$\tilde X_{\cdot\cdot} = \frac{1}{k} \sum_{i=1}^{k} \tilde X_{i\cdot} \qquad (7)$$

and reject $H_0 : \mu_1 = \cdots = \mu_k$ if and only if

$$\tilde F > F(\alpha; k, n_0) \qquad (8)$$

where $F(\alpha; k, n_0)$ is the upper $100\alpha$ percent point of the
distribution of $Q = \sum_{i=1}^{k} (t_i - \bar t)^2$ when
$t_1, \ldots, t_k$ are independent Student's-t random variables with
$n_0 - 1$ degrees of freedom and $\bar t = (t_1 + \cdots + t_k)/k$.
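The percent point of Q described above can also be estimated by simulation when tables are not at hand. The following is a minimal sketch, not the exact tabulated values: it draws Student's-t variates with the Python standard library and takes an empirical quantile, and the function names, replication count, and seed are illustrative assumptions.

```python
import random

def student_t(df, rng):
    """One Student's-t variate: N(0,1) divided by sqrt(chi-square(df) / df)."""
    normal = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape df/2, scale 2)
    return normal / (chi2 / df) ** 0.5

def q_statistic(k, n0, rng):
    """One draw of Q = sum_i (t_i - t_bar)^2 with t_i ~ Student's t, n0 - 1 d.f."""
    t = [student_t(n0 - 1, rng) for _ in range(k)]
    t_bar = sum(t) / k
    return sum((ti - t_bar) ** 2 for ti in t)

def critical_value(alpha, k, n0, reps=20000, seed=1):
    """Monte Carlo estimate of the upper 100*alpha percent point of Q."""
    rng = random.Random(seed)
    draws = sorted(q_statistic(k, n0, rng) for _ in range(reps))
    return draws[int((1.0 - alpha) * reps)]

if __name__ == "__main__":
    # k = 4 treatments and n0 = 10 first-stage observations, as in the example
    print(round(critical_value(0.05, k=4, n0=10), 2))
```

The estimate carries simulation error, so it should be treated as a rough check on tabled values rather than a replacement for them.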

We will now discuss the choice of $z$ and tables of
$F(\alpha; k, n_0)$. The level and power of the new test do not depend
on the unknown variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$,
but rather only on $n_0$ and $z$. Thus $z > 0$ should be chosen so that
one has the desired power, say $P^*$, at a given alternative. Exact
tables needed for this purpose have been given in [1]. However, as long
as $n_0$ is not very small a simple approximation is available: one may
act as if the test statistic $\tilde F$ of equation (6) has the same
distribution as


$$\frac{n_0 - 1}{n_0 - 3}\, \chi^2_{k-1}(\Delta) \qquad (9)$$


where $\chi^2_{k-1}(\Delta)$ is a noncentral chi-square random variable
with $k-1$ degrees of freedom and noncentrality parameter (using the
distributional form given by Johnson and Kotz [6])

$$\Delta = \sum_{i=1}^{k} (\mu_i - \bar\mu)^2 / z \qquad (10)$$

where $\bar\mu = (\mu_1 + \cdots + \mu_k)/k$.

A simple method of interpreting $\Delta$ is as follows. If the
experimenter specifies the minimum range between the largest $\mu_i$
and the smallest $\mu_i$ which he wishes to detect as $\delta$ units,
then whenever
$\max(\mu_1, \ldots, \mu_k) - \min(\mu_1, \ldots, \mu_k) \ge \delta$ we
have $\Delta \ge \delta^2/(4z)$. One can then choose $z$ to attain power
$P^*$ when $\Delta = \delta^2/(4z)$, which occurs when
$\mu_1 = -\delta/2$, $\mu_2 = \cdots = \mu_k = \delta/2$.




From  this  point  a  numerical  example,  given  in  the  next  section, 
is  the  easiest  way  to  show  very  simply  how  one  proceeds,  step  by 
step,  in  practice. 

NUMERICAL  EXAMPLE  IN  QC 

Suppose we wish to test the hypothesis that 4 different bleaching
chemicals are equivalent in their effects on pulp brightness. Suppose
we decide to take initial samples of size 10 with each treatment, want
only a 5% chance of rejecting $H_0$ if in fact $H_0$ is true, and want
an 85% chance of rejecting $H_0$ if the spread among
$\mu_1, \mu_2, \mu_3, \mu_4$ is at least 4.0 units. We then proceed,
step by step, as follows.

Step 1. (Problem specification.) Here $k = 4$ sources of observations
are available, we desire an $\alpha = .05$ level test of
$H_0 : \mu_1 = \mu_2 = \mu_3 = \mu_4$, and if the spread among
$\mu_1, \mu_2, \mu_3, \mu_4$ is $\delta = 4.0$ units or more we desire
power (probability of then rejecting the false hypothesis $H_0$) of at
least $P^* = .85$.

Step 2. (Choice of procedure.) Assuming we do not know that
$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$, only the procedure
given in this paper can guarantee the specifications outlined in
Step 1 above. It requires we sample $n_0$ observations in our first
stage, and recommends $n_0$ be at least 12 (though any $n_0 \ge 2$ will
work). Suppose the experimenter only wants to invest 40 units in
first-stage experimentation and sets $n_0 = 10$.

Step 3. (First stage.) Draw $n_0 = 10$ independent observations from
each source, with results as in Table 1.


Table 1. First Stage Samples of Pulp Brightness

Chemical 1    Chemical 2    Chemical 3    Chemical 4
  77.199        80.522        79.417        78.001
  74.466        79.306        78.017        78.358
  82.746        81.914        81.596        77.544
  76.208        80.346        80.802        77.364
  82.876        78.385        80.626        77.554
  76.224        81.838        79.011        75.911
  78.061        82.785        80.549        78.043
  76.391        80.900        78.479        78.947
  76.155        79.185        81.798        77.146
  78.045        80.620        80.923        77.386

Step 4. (Analysis of first stage data.) We now calculate the first
stage sample variances $s_1^2, s_2^2, s_3^2, s_4^2$, the total sample
sizes $N_1, N_2, N_3, N_4$ needed from the four sources, and the factors
$a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4$ to be used in the second stage
analysis. The $s_i^2$'s are given in Table 2, along with the other
quantities. Here $N_i$ is calculated from (2), $b_i$ from (4), and
$a_i$ from (5). The $z$ needed in (2) is found as follows.

We desire power $P^* = .85$ (Step 1 above) when

$$\Delta = \frac{\delta^2}{4z} = \frac{(4.0)^2}{4z} = \frac{4.0}{z} . \qquad (11)$$


To set $z$ for this power requirement, we first need to know "When do
we reject?". From (8) we know we will later reject $H_0$ if
$\tilde F > F(.05; 4, 10)$ where, approximately,


The 7.81 is the value a central chi-square random variable with
$k - 1 = 4 - 1 = 3$ degrees of freedom exceeds with probability
$\alpha = .05$ (see standard tables, e.g., p. 137 of Pearson and
Hartley [8] or p. 459 of Dudewicz [3]).

The power will be, approximately,

$$P[\chi^2_3(\Delta) > 7.81] = .85 \qquad (13)$$

if (see p. 53 of the tables in [5])

$$\Delta = 12.301 , \qquad (14)$$

so (using equation (11))

$$z = \frac{4.0}{12.30} = .325 . \qquad (15)$$
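The level, power, and $z$ figures in this step can be checked numerically. The sketch below is an illustration under stated assumptions, not the tables cited in the text: it evaluates the noncentral chi-square tail as a Poisson-weighted mixture of central chi-square distributions using only the Python standard library, and the function names `gamma_cdf` and `chi2_sf` are ours.

```python
import math

def gamma_cdf(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_sf(x, df, nc=0.0, terms=200):
    """P[chi-square with df d.f. and noncentrality nc exceeds x]."""
    weight = math.exp(-nc / 2.0)  # Poisson(nc / 2) probability of j = 0
    cdf = 0.0
    for j in range(terms):
        # central chi-square(df + 2j) CDF at x, weighted by the Poisson probability of j
        cdf += weight * gamma_cdf(df / 2.0 + j, x / 2.0)
        weight *= (nc / 2.0) / (j + 1)
    return 1.0 - cdf

level = chi2_sf(7.81, 3)             # central case: should be close to .05
power = chi2_sf(7.81, 3, nc=12.301)  # should be close to .85
z = 4.0 ** 2 / (4 * 12.30)           # delta^2 / (4 * Delta) with delta = 4.0
print(round(level, 3), round(power, 3), round(z, 3))
```

The truncation at 200 Poisson terms is far more than needed for noncentrality values of this size; it is an arbitrary safe choice rather than a tuned one.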


Table 2. Analysis of First Stage

               Chemical 1    Chemical 2    Chemical 3    Chemical 4
n_0                10            10            10            10
Sample Mean      77.837        80.580        80.122        77.625
s_i^2            7.9605        1.8811        1.7174         .6762
z                 .325          .325          .325          .325
N_i                26            11            11            11
b_i               .046          .364          .390          .686
a_i               .026          .064          .061          .031
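The weights in Table 2 can be reproduced directly from (4) and (5). The following is a minimal sketch in which the $N_i$, $s_i^2$, and $z$ values are copied from Table 2 rather than recomputed; the function names are illustrative assumptions, and the two helper functions for the weighted mean and the test statistic are included only to show how the weights would be used in the later steps.

```python
import math

def second_stage_weights(n0, N, s2, z):
    """Return (a_i, b_i): b_i from equation (4), then a_i from equation (5)."""
    if N <= n0:
        raise ValueError("N must exceed n0")
    if N * z < s2:
        raise ValueError("need N * z >= s2 for the square root to be real")
    root = math.sqrt(n0 * (N * z - s2) / ((N - n0) * s2))
    b = (1.0 + root) / N
    a = (1.0 - (N - n0) * b) / n0
    return a, b

def weighted_mean(first_stage, second_stage, a, b):
    """Weight a on each first-stage observation, b on each second-stage one."""
    return a * sum(first_stage) + b * sum(second_stage)

def f_tilde(weighted_means, z):
    """Sum of squared deviations of the weighted means from their average, over z."""
    grand = sum(weighted_means) / len(weighted_means)
    return sum((x - grand) ** 2 for x in weighted_means) / z

# Values copied from Table 2: chemical -> (N_i, s_i^2)
n0, z = 10, 0.325
table2 = {1: (26, 7.9605), 2: (11, 1.8811), 3: (11, 1.7174), 4: (11, 0.6762)}
weights = {i: second_stage_weights(n0, N, s2, z) for i, (N, s2) in table2.items()}

for i, (a, b) in weights.items():
    print(i, "b =", round(b, 3), "a =", round(a, 3))
```

Two checks are useful when computing these by hand or by program. The weights for each source sum to one, that is $n_0 a_i + (N_i - n_0) b_i = 1$, and, as follows algebraically from (4) and (5), $s_i^2 [n_0 a_i^2 + (N_i - n_0) b_i^2] = z$ for every source.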

Step 5. (Second stage.) Draw $N_i - n_0$ observations from source $i$
($i = 1, 2, 3, 4$), yielding Table 3.


Chemical  2 


Chemical  3 


Chemical  4 


82.549 

78.970 
78.496 
78.494 

80.971 
80.313 
76.556 
80.115 
78. £59 
77.697 
80.590 
79.647 
82.733 
80.522 
79.098 
78.905 


Step  6.  (Final  calculations 
of  (6) ,  and  find 

-  78.856,  X  -  80 


Step  7.  (Final  decision.) 
we  reject  the  null  hypothesl 


effects  on  pulp  brightness. 
point estimates, confidence intervals, and selection procedures which
guarantee a desired probability of correct selection are not available
for the basic parameters of interest if one uses the traditional ANOVA
after a transformation of the data, but they are for our Procedure P.
While we cannot discuss this in detail here, it should be borne in mind
that the new methods are backed up by an extensive statistical arsenal
of procedures for goals other than testing which one might be interested
in.


LOSSES  INCURRED  BY  NOT  USING  THE  NEW  PROCEDURE 

In our example with $k = 4$ different bleaching chemicals, suppose
the new procedure were not used, but rather that the traditional ANOVA
procedure were used. If the samples taken were $n_1 = 6$, $n_2 = 60$,
$n_3 = 80$, $n_4 = 10$ observations from treatments 1, 2, 3, 4
respectively, the traditional F-test would reject $H_0$ if its F value
exceeded 2.74. However, while this yields a level of .05 if
$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$, if the variances
differ then the level can be greatly different. E.g., if
$\sigma_1 = 3$, $\sigma_2 = 1$, $\sigma_3 = 1$, $\sigma_4 = 1$, the true
level will be .134 (almost three times the desired .05 level, meaning
13.4% of the time one will decide bleaching chemical has an effect on
pulp brightness when in fact it has no such effect). However, if
$\sigma_1 = 1$, $\sigma_2 = 2$, $\sigma_3 = 2$, $\sigma_4 = 3$, the true
level will be .040 (below the desired .05 level) with the traditional
F-test.
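The sensitivity of the traditional F-test's level to unequal variances can be explored by simulation. The sketch below is an illustration only, and it does not reproduce the figures quoted above: the group sizes and standard deviations are hypothetical, the critical value is estimated by simulation under equal variances rather than read from F tables, and the function names, replication count, and seed are assumptions.

```python
import random

def f_statistic(groups):
    """Classical one-way ANOVA F ratio: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def simulate_f(sizes, sigmas, reps, rng):
    """F ratios under H0 (all means zero) for given group sizes and std. deviations."""
    return [
        f_statistic([[rng.gauss(0.0, s) for _ in range(n)]
                     for n, s in zip(sizes, sigmas)])
        for _ in range(reps)
    ]

def true_level(sizes, sigmas, alpha=0.05, reps=4000, seed=7):
    """Estimated rejection rate under H0 when the actual std. deviations are sigmas."""
    rng = random.Random(seed)
    # Critical value the equal-variance analysis would use (empirical quantile).
    null = sorted(simulate_f(sizes, [1.0] * len(sizes), reps, rng))
    crit = null[int((1.0 - alpha) * reps)]
    # How often that critical value is exceeded when variances are in fact unequal.
    hetero = simulate_f(sizes, sigmas, reps, rng)
    return sum(f > crit for f in hetero) / reps

if __name__ == "__main__":
    sizes = (5, 20, 20, 20)  # hypothetical: small sample on the most variable treatment
    print("equal variances:  ", true_level(sizes, (1.0, 1.0, 1.0, 1.0)))
    print("unequal variances:", true_level(sizes, (3.0, 1.0, 1.0, 1.0)))
```

With a small sample on the most variable treatment, the estimated level under unequal variances comes out well above the nominal .05, which is the kind of loss described in this section.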

The F-test has similar problems with its power. For example,
while its power at $\sum (\mu_i - \bar\mu)^2 = 1.0$ is .459 when
$\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4$, it is .261 when
$\sigma_1 = 3$, $\sigma_2 = 1$, $\sigma_3 = 1$, $\sigma_4 = 1$, and it
is merely .076 when $\sigma_1 = 1$, $\sigma_2 = 1$, $\sigma_3 = 1$,
$\sigma_4 = 4$. This means one can have no certitude of rejecting $H_0$
when it is false if one's treatments have unequal variabilities and one
uses the F-test.

Since the new procedures yield the desired level and power whether
the variances are equal or not, and since sizable losses can be
incurred by continuing to use the old procedures when one has unequal
variances, use of the new methods is strongly recommended.